This report explores a data set on the chemical properties of red wine.
## Observations: 1,599
## Variables: 13
## $ X <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13...
## $ fixed.acidity <dbl> 7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, ...
## $ volatile.acidity <dbl> 0.700, 0.880, 0.760, 0.280, 0.700, 0.660,...
## $ citric.acid <dbl> 0.00, 0.00, 0.04, 0.56, 0.00, 0.00, 0.06,...
## $ residual.sugar <dbl> 1.9, 2.6, 2.3, 1.9, 1.9, 1.8, 1.6, 1.2, 2...
## $ chlorides <dbl> 0.076, 0.098, 0.092, 0.075, 0.076, 0.075,...
## $ free.sulfur.dioxide <dbl> 11, 25, 15, 17, 11, 13, 15, 15, 9, 17, 15...
## $ total.sulfur.dioxide <dbl> 34, 67, 54, 60, 34, 40, 59, 21, 18, 102, ...
## $ density <dbl> 0.9978, 0.9968, 0.9970, 0.9980, 0.9978, 0...
## $ pH <dbl> 3.51, 3.20, 3.26, 3.16, 3.51, 3.51, 3.30,...
## $ sulphates <dbl> 0.56, 0.68, 0.65, 0.58, 0.56, 0.56, 0.46,...
## $ alcohol <dbl> 9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10.0, ...
## $ quality <int> 5, 5, 5, 6, 5, 5, 5, 7, 7, 5, 5, 5, 5, 5,...
## X fixed.acidity volatile.acidity citric.acid
## Min. : 1.0 Min. : 4.60 Min. :0.1200 Min. :0.000
## 1st Qu.: 400.5 1st Qu.: 7.10 1st Qu.:0.3900 1st Qu.:0.090
## Median : 800.0 Median : 7.90 Median :0.5200 Median :0.260
## Mean : 800.0 Mean : 8.32 Mean :0.5278 Mean :0.271
## 3rd Qu.:1199.5 3rd Qu.: 9.20 3rd Qu.:0.6400 3rd Qu.:0.420
## Max. :1599.0 Max. :15.90 Max. :1.5800 Max. :1.000
## residual.sugar chlorides free.sulfur.dioxide
## Min. : 0.900 Min. :0.01200 Min. : 1.00
## 1st Qu.: 1.900 1st Qu.:0.07000 1st Qu.: 7.00
## Median : 2.200 Median :0.07900 Median :14.00
## Mean : 2.539 Mean :0.08747 Mean :15.87
## 3rd Qu.: 2.600 3rd Qu.:0.09000 3rd Qu.:21.00
## Max. :15.500 Max. :0.61100 Max. :72.00
## total.sulfur.dioxide density pH sulphates
## Min. : 6.00 Min. :0.9901 Min. :2.740 Min. :0.3300
## 1st Qu.: 22.00 1st Qu.:0.9956 1st Qu.:3.210 1st Qu.:0.5500
## Median : 38.00 Median :0.9968 Median :3.310 Median :0.6200
## Mean : 46.47 Mean :0.9967 Mean :3.311 Mean :0.6581
## 3rd Qu.: 62.00 3rd Qu.:0.9978 3rd Qu.:3.400 3rd Qu.:0.7300
## Max. :289.00 Max. :1.0037 Max. :4.010 Max. :2.0000
## alcohol quality
## Min. : 8.40 Min. :3.000
## 1st Qu.: 9.50 1st Qu.:5.000
## Median :10.20 Median :6.000
## Mean :10.42 Mean :5.636
## 3rd Qu.:11.10 3rd Qu.:6.000
## Max. :14.90 Max. :8.000
There are 1,599 observations and 13 columns. Note that X is just the sample number, not a property of the wine, which leaves 12 variables. Quality is the output variable that we will be exploring.
Let's drop the unused variable X.
I am also creating a new variable called rating, which splits the wines into 3 categories: bad, good and excellent.
Prior to proceeding to the plots, we cleaned up the data by 1) transforming quality from an integer to an ordered factor, 2) dropping variable X, which is just a numbering of the data, and 3) creating a new variable called rating (bad, good, excellent).
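The three clean-up steps can be sketched in base R as follows. This is a minimal sketch, not the report's actual code: the data frame is typed in by hand from the first eight rows shown in the output above, and the cut points for rating (3-4 bad, 5-6 good, 7-8 excellent) are inferred from the per-rating summaries later in the report.

```r
# Toy data: first eight rows of X, alcohol and quality from the output above.
wine <- data.frame(
  X       = 1:8,
  alcohol = c(9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10.0),
  quality = c(5L, 5L, 5L, 6L, 5L, 5L, 5L, 7L)
)

# Step 2: drop the row-number column.
wine$X <- NULL

# Step 3: derive rating from the integer score (done before quality becomes a factor).
# The intervals are (0,4], (4,6], (6,10]; these cut points are an assumption.
wine$rating <- cut(
  wine$quality,
  breaks = c(0, 4, 6, 10),
  labels = c("bad", "good", "excellent"),
  ordered_result = TRUE
)

# Step 1: convert quality to an ordered factor over the observed range 3..8.
wine$quality <- factor(wine$quality, levels = 3:8, ordered = TRUE)

print(table(wine$rating))
```

Deriving rating before converting quality keeps the numeric comparison in `cut` straightforward; doing it the other way round would require converting the factor back to numbers first.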
The majority of the red wine samples received a quality score of 5 or 6. No wine received a score below 3 or above 8.
When we plot the data by rating, we can see that the majority of wines fall into the “good” category (quality scores of 5-6), followed by the “excellent” category (scores of 7 and higher). Only a small proportion of wines fall into the “bad” category.
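To put numbers on these proportions, the counts printed in the per-rating summaries further down can be turned into percentages. The sketch below simply copies those counts by hand; the variable names are illustrative.

```r
# Counts copied from the per-rating summaries in this report.
rating_counts  <- c(bad = 63, good = 1319, excellent = 217)
quality_counts <- c("3" = 10, "4" = 53, "5" = 681, "6" = 638, "7" = 199, "8" = 18)

# Both breakdowns should add up to the 1,599 observations.
stopifnot(sum(rating_counts) == 1599, sum(quality_counts) == 1599)

# Share of each rating, in percent.
rating_pct <- round(100 * prop.table(rating_counts), 1)
print(rating_pct)
```

This gives roughly 4% bad, 82% good and 14% excellent wines.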
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.60 7.10 7.90 8.32 9.20 15.90
Most red wines have fixed acidity between 7 and 9. There are some outliers, ranging up to 15.90. The median fixed acidity of the red wines is 7.90.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1200 0.3900 0.5200 0.5278 0.6400 1.5800
Half of the red wines have volatile acidity between 0.39 and 0.64 (the first and third quartiles). There are some outliers, ranging up to 1.58.
The data is skewed to the right, so let's remove the outliers.
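The report does not say which rule was used to remove outliers. One common convention is Tukey's 1.5 x IQR rule, sketched below using the quartiles from the summary above. The choice of rule and the data frame name `wine` in the final comment are assumptions.

```r
# Quartiles of volatile.acidity as reported in the summary above.
q1 <- 0.39
q3 <- 0.64
iqr <- q3 - q1

# Tukey fences: values outside [lower_fence, upper_fence] are treated as outliers.
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

print(c(lower = lower_fence, upper = upper_fence))

# With the full data frame (assumed here to be called `wine`), filtering would be:
# trimmed <- subset(wine, volatile.acidity <= upper_fence)
```

Under this rule the upper fence is about 1.015, so the maximum value of 1.58 would be flagged as an outlier.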
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.090 0.260 0.271 0.420 1.000
Citric acid values range from 0 to 1, although 75% of the samples are at or below 0.42.
Residual sugar values vary considerably throughout the sample population. It would be interesting to see if there is any correlation with the quality of the wine. Residual sugar values range from 0.9 to 15.5; however, zooming in, we see that most red wines have residual sugar of 1.5 to 2.5.
Another chemical property skewed to the right is chlorides. Eliminating the outliers and zooming in, we can see that most red wines have chloride values of 0.05 to 0.09.
Similar to chlorides and residual sugar, free sulfur dioxide levels are also skewed to the right. There seems to be a pattern here and it would be interesting to later analyze if there is a correlation between these 3 variables. Are the same samples appearing as outliers in plots for all three (chlorides, residual sugar, free sulfur dioxide)?
Zooming into free sulfur dioxide, we see that the majority of wines range from 3 to 40. There is a spike at values 5 and 6, which will be interesting to explore later.
Most red wines in the sample have total sulfur dioxide ranging from 10 to 50.
Density and pH values look normally distributed, with most wines having a density from 0.995 to 1 and a pH from 3.1 to 3.5.
Sulphates are skewed to the right with most wines ranging between 0.5 and 0.7.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 8.40 9.50 10.20 10.42 11.10 14.90
The alcohol level of the wines in the sample has a mean of 10.42 and a median of 10.20. About 75% of the wines have an alcohol level at or below 11.10.
## fixed.acidity volatile.acidity citric.acid residual.sugar
## Min. : 4.900 Min. :0.1200 Min. :0.0000 Min. :1.200
## 1st Qu.: 7.400 1st Qu.:0.3000 1st Qu.:0.3000 1st Qu.:2.000
## Median : 8.700 Median :0.3700 Median :0.4000 Median :2.300
## Mean : 8.847 Mean :0.4055 Mean :0.3765 Mean :2.709
## 3rd Qu.:10.100 3rd Qu.:0.4900 3rd Qu.:0.4900 3rd Qu.:2.700
## Max. :15.600 Max. :0.9150 Max. :0.7600 Max. :8.900
## chlorides free.sulfur.dioxide total.sulfur.dioxide
## Min. :0.01200 Min. : 3.00 Min. : 7.00
## 1st Qu.:0.06200 1st Qu.: 6.00 1st Qu.: 17.00
## Median :0.07300 Median :11.00 Median : 27.00
## Mean :0.07591 Mean :13.98 Mean : 34.89
## 3rd Qu.:0.08500 3rd Qu.:18.00 3rd Qu.: 43.00
## Max. :0.35800 Max. :54.00 Max. :289.00
## density pH sulphates alcohol quality
## Min. :0.9906 Min. :2.880 Min. :0.3900 Min. : 9.20 3: 0
## 1st Qu.:0.9947 1st Qu.:3.200 1st Qu.:0.6500 1st Qu.:10.80 4: 0
## Median :0.9957 Median :3.270 Median :0.7400 Median :11.60 5: 0
## Mean :0.9960 Mean :3.289 Mean :0.7435 Mean :11.52 6: 0
## 3rd Qu.:0.9973 3rd Qu.:3.380 3rd Qu.:0.8200 3rd Qu.:12.20 7:199
## Max. :1.0032 Max. :3.780 Max. :1.3600 Max. :14.00 8: 18
## rating
## bad : 0
## good : 0
## excellent:217
##
##
##
## fixed.acidity volatile.acidity citric.acid residual.sugar
## Min. : 4.600 Min. :0.2300 Min. :0.0000 Min. : 1.200
## 1st Qu.: 6.800 1st Qu.:0.5650 1st Qu.:0.0200 1st Qu.: 1.900
## Median : 7.500 Median :0.6800 Median :0.0800 Median : 2.100
## Mean : 7.871 Mean :0.7242 Mean :0.1737 Mean : 2.685
## 3rd Qu.: 8.400 3rd Qu.:0.8825 3rd Qu.:0.2700 3rd Qu.: 2.950
## Max. :12.500 Max. :1.5800 Max. :1.0000 Max. :12.900
## chlorides free.sulfur.dioxide total.sulfur.dioxide
## Min. :0.04500 Min. : 3.00 Min. : 7.00
## 1st Qu.:0.06850 1st Qu.: 5.00 1st Qu.: 13.50
## Median :0.08000 Median : 9.00 Median : 26.00
## Mean :0.09573 Mean :12.06 Mean : 34.44
## 3rd Qu.:0.09450 3rd Qu.:15.50 3rd Qu.: 48.00
## Max. :0.61000 Max. :41.00 Max. :119.00
## density pH sulphates alcohol quality
## Min. :0.9934 Min. :2.740 Min. :0.3300 Min. : 8.40 3:10
## 1st Qu.:0.9957 1st Qu.:3.300 1st Qu.:0.4950 1st Qu.: 9.60 4:53
## Median :0.9966 Median :3.380 Median :0.5600 Median :10.00 5: 0
## Mean :0.9967 Mean :3.384 Mean :0.5922 Mean :10.22 6: 0
## 3rd Qu.:0.9977 3rd Qu.:3.500 3rd Qu.:0.6000 3rd Qu.:11.00 7: 0
## Max. :1.0010 Max. :3.900 Max. :2.0000 Max. :13.10 8: 0
## rating
## bad :63
## good : 0
## excellent: 0
##
##
##
## fixed.acidity volatile.acidity citric.acid residual.sugar
## Min. : 4.700 Min. :0.1600 Min. :0.0000 Min. : 0.900
## 1st Qu.: 7.100 1st Qu.:0.4100 1st Qu.:0.0900 1st Qu.: 1.900
## Median : 7.800 Median :0.5400 Median :0.2400 Median : 2.200
## Mean : 8.254 Mean :0.5386 Mean :0.2583 Mean : 2.504
## 3rd Qu.: 9.100 3rd Qu.:0.6400 3rd Qu.:0.4000 3rd Qu.: 2.600
## Max. :15.900 Max. :1.3300 Max. :0.7900 Max. :15.500
## chlorides free.sulfur.dioxide total.sulfur.dioxide
## Min. :0.03400 Min. : 1.00 Min. : 6.00
## 1st Qu.:0.07100 1st Qu.: 8.00 1st Qu.: 24.00
## Median :0.08000 Median :14.00 Median : 40.00
## Mean :0.08897 Mean :16.37 Mean : 48.95
## 3rd Qu.:0.09100 3rd Qu.:22.00 3rd Qu.: 65.00
## Max. :0.61100 Max. :72.00 Max. :165.00
## density pH sulphates alcohol quality
## Min. :0.9901 Min. :2.860 Min. :0.3700 Min. : 8.40 3: 0
## 1st Qu.:0.9958 1st Qu.:3.210 1st Qu.:0.5400 1st Qu.: 9.50 4: 0
## Median :0.9968 Median :3.310 Median :0.6100 Median :10.00 5:681
## Mean :0.9969 Mean :3.311 Mean :0.6473 Mean :10.25 6:638
## 3rd Qu.:0.9979 3rd Qu.:3.400 3rd Qu.:0.7000 3rd Qu.:10.90 7: 0
## Max. :1.0037 Max. :4.010 Max. :1.9800 Max. :14.90 8: 0
## rating
## bad : 0
## good :1319
## excellent: 0
##
##
##
There are 1,599 observations in the dataset with 12 variables. At least 3 wine experts rated the quality of each wine, providing a rating between 0 (very bad) and 10 (very excellent). Variables in the dataset are:
1 - fixed acidity: most acids involved with wine are fixed or nonvolatile (do not evaporate readily).
2 - volatile acidity: the amount of acetic acid in wine, which at too high of levels can lead to an unpleasant, vinegar taste.
3 - citric acid: found in small quantities, citric acid can add ‘freshness’ and flavor to wines.
4 - residual sugar: the amount of sugar remaining after fermentation stops, it’s rare to find wines with less than 1 gram/liter and wines with greater than 45 grams/liter are considered sweet.
5 - chlorides: the amount of salt in the wine.
6 - free sulfur dioxide: the free form of SO2 exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine.
7 - total sulfur dioxide: amount of free and bound forms of SO2; in low concentrations, SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the nose and taste of wine.
8 - density: the density of wine is close to that of water, depending on the percent alcohol and sugar content.
9 - pH: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale.
10 - sulphates: a wine additive which can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant.
11 - alcohol: the percent alcohol content of the wine.
Output variable (based on sensory data):
12 - quality (score between 0 and 10)
The main feature of interest in the dataset is the quality of the wine, along with the other variables which, directly or in combination with other characteristics, impact that quality.
Comparing the properties of bad, good and excellent wines, volatile.acidity and citric.acid differ noticeably between ratings, which hints that these are properties impacting the quality of wine. Another characteristic that would be interesting to explore is the level of alcohol and its impact on the quality of wine.
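The comparison across ratings can be made explicit by lining up the group means printed in the three per-rating summaries above. The sketch below copies those means by hand (it does not recompute them from the raw data) and checks whether each one moves steadily from bad to excellent.

```r
rating_levels <- c("bad", "good", "excellent")

# Means copied from the per-rating summary output in this report.
group_means <- data.frame(
  rating           = factor(rating_levels, levels = rating_levels, ordered = TRUE),
  volatile.acidity = c(0.7242, 0.5386, 0.4055),
  citric.acid      = c(0.1737, 0.2583, 0.3765),
  alcohol          = c(10.22, 10.25, 11.52)
)

# Change in each mean when moving one rating up (bad -> good -> excellent).
steps <- sapply(group_means[-1], diff)
rownames(steps) <- c("bad->good", "good->excellent")
print(steps)
```

Mean volatile.acidity falls at every step and mean citric.acid rises at every step, while mean alcohol barely changes between bad and good and then rises for excellent wines.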
I created a rating variable to group the population into 3 categories: bad, good and excellent.
Chlorides, residual sugar and free sulfur dioxide levels are all skewed to the right. There seems to be a pattern here and it would be interesting to later analyze if there is a correlation between these 3 variables. Are the same samples appearing as outliers in plots for all three (chlorides, residual sugar, free sulfur dioxide)?
I deleted the variable X from the dataset, since it was just a numbering of the samples and not a characteristic of the wine. I also transformed quality from an integer to an ordered factor.
## fixed.acidity volatile.acidity citric.acid
## fixed.acidity 1.00 -0.26 0.67
## volatile.acidity -0.26 1.00 -0.55
## citric.acid 0.67 -0.55 1.00
## residual.sugar 0.11 0.00 0.14
## chlorides 0.09 0.06 0.20
## free.sulfur.dioxide -0.15 -0.01 -0.06
## total.sulfur.dioxide -0.11 0.08 0.04
## density 0.67 0.02 0.36
## pH -0.68 0.23 -0.54
## sulphates 0.18 -0.26 0.31
## alcohol -0.06 -0.20 0.11
## quality 0.12 -0.39 0.23
## residual.sugar chlorides free.sulfur.dioxide
## fixed.acidity 0.11 0.09 -0.15
## volatile.acidity 0.00 0.06 -0.01
## citric.acid 0.14 0.20 -0.06
## residual.sugar 1.00 0.06 0.19
## chlorides 0.06 1.00 0.01
## free.sulfur.dioxide 0.19 0.01 1.00
## total.sulfur.dioxide 0.20 0.05 0.67
## density 0.36 0.20 -0.02
## pH -0.09 -0.27 0.07
## sulphates 0.01 0.37 0.05
## alcohol 0.04 -0.22 -0.07
## quality 0.01 -0.13 -0.05
## total.sulfur.dioxide density pH sulphates alcohol
## fixed.acidity -0.11 0.67 -0.68 0.18 -0.06
## volatile.acidity 0.08 0.02 0.23 -0.26 -0.20
## citric.acid 0.04 0.36 -0.54 0.31 0.11
## residual.sugar 0.20 0.36 -0.09 0.01 0.04
## chlorides 0.05 0.20 -0.27 0.37 -0.22
## free.sulfur.dioxide 0.67 -0.02 0.07 0.05 -0.07
## total.sulfur.dioxide 1.00 0.07 -0.07 0.04 -0.21
## density 0.07 1.00 -0.34 0.15 -0.50
## pH -0.07 -0.34 1.00 -0.20 0.21
## sulphates 0.04 0.15 -0.20 1.00 0.09
## alcohol -0.21 -0.50 0.21 0.09 1.00
## quality -0.19 -0.17 -0.06 0.25 0.48
## quality
## fixed.acidity 0.12
## volatile.acidity -0.39
## citric.acid 0.23
## residual.sugar 0.01
## chlorides -0.13
## free.sulfur.dioxide -0.05
## total.sulfur.dioxide -0.19
## density -0.17
## pH -0.06
## sulphates 0.25
## alcohol 0.48
## quality 1.00
Before starting the bivariate plot analysis, I would like to run ggpairs to see the relationships between the different variables.
Let's first explore which properties of the wine correlate with quality. From the above correlation matrix, we see that quality has its highest positive correlation with alcohol (0.48) and its strongest negative correlation with volatile acidity (-0.39).
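To read the quality column of the matrix more easily, the coefficients can be sorted by absolute value. The sketch below copies the values from the correlation matrix above by hand rather than recomputing them; the variable names are illustrative.

```r
# Correlation of each variable with quality, copied from the matrix above.
quality_cor <- c(
  fixed.acidity        =  0.12,
  volatile.acidity     = -0.39,
  citric.acid          =  0.23,
  residual.sugar       =  0.01,
  chlorides            = -0.13,
  free.sulfur.dioxide  = -0.05,
  total.sulfur.dioxide = -0.19,
  density              = -0.17,
  pH                   = -0.06,
  sulphates            =  0.25,
  alcohol              =  0.48
)

# Rank by strength of association, ignoring the sign.
ranked <- quality_cor[order(abs(quality_cor), decreasing = TRUE)]
print(head(ranked, 4))
```

The four strongest associations with quality are alcohol, volatile.acidity, sulphates and citric.acid, in that order.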
The box plot of quality vs. alcohol is quite interesting: it appears that higher quality wines have a higher alcohol content.
The box plot of quality against volatile acidity shows that lower quality wines typically have higher levels of volatile acidity and, vice versa, higher quality wines have lower levels. This makes sense because volatile acidity is the amount of acetic acid in wine, which at too high a level can lead to an unpleasant, vinegar taste.
There is a slight correlation between quality and citric.acid (0.23), and it appears that higher quality wines have a slightly higher level of citric acid. This also makes sense given that citric acid can add ‘freshness’ and flavor to wines.
pH describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale. The above boxplot shows that the wines in our sample are mainly within the 3-4 range. In general, better quality wines tend to have a slightly lower pH, with some outliers of course.
Next, let's explore which chemical properties alcohol and volatile.acidity are correlated with.
Alcohol has its strongest correlation with density (-0.50), so let's plot that first.
There is a negative correlation between density and alcohol, which makes sense given that the density of wine is close to that of water, depending on the percent alcohol and sugar content. Because alcohol is less dense than water, a higher alcohol content lowers the density.
Per the correlation matrix, volatile.acidity has its strongest correlation with citric.acid (-0.55), so let's plot that relationship.
There is a moderately negative relationship between volatile.acidity and citric.acid: higher levels of citric.acid are associated with lower levels of volatile.acidity.
Per the chart above, the higher the level of fixed.acidity, the higher the density.
Citric.acid is positively correlated with density: higher citric.acid is associated with higher density.
There is a positive correlation between fixed.acidity and citric.acid levels.
It appears pH and fixed.acidity have a high negative correlation. pH describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale. Per the chart above, wines with lower levels of fixed.acidity are less acidic (they have a higher pH).
Per the chart, higher levels of citric.acid are associated with lower pH, i.e. higher citric.acid means more acidic.
My main interest in the above analysis was which variables are associated with the quality of wine. I noticed that quality has a positive correlation with alcohol and a negative correlation with volatile acidity.
1. It appears that higher quality wines have a higher alcohol content.
2. The box plot of quality against volatile acidity shows that lower quality wines have higher levels of volatile acidity and higher quality wines have lower levels. This makes sense because volatile acidity is the amount of acetic acid in wine, which at too high a level can lead to an unpleasant, vinegar taste.
3. There is a slight correlation between quality and citric.acid, and it appears that higher quality wines have a slightly higher level of citric acid. This also makes sense given that citric acid can add ‘freshness’ and flavor to wines.
4. The pH vs. quality plot shows that the wines in our sample are mainly within the 3-4 range. In general, better quality wines tend to have a slightly lower pH, with some outliers of course.
In my analysis I explored the relationships between those variables which had the highest correlation coefficients. There were some more obvious observations, like the negative correlation between density and alcohol and the negative relationship between volatile.acidity and citric.acid.
However, the relationships I found most interesting were those between fixed.acidity and other variables. There is a positive correlation between fixed.acidity and citric.acid levels, and citric.acid is known to bring freshness and flavor to wine. Meanwhile, pH and fixed.acidity have a high negative correlation: wines with lower levels of fixed.acidity appear less acidic. Also, the higher the level of fixed.acidity, the higher the density.
The relationship between fixed.acidity and pH is the strongest one I found (-0.68).
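As a quick check on which pair is strongest, the sketch below searches a small sub-matrix for the largest absolute off-diagonal coefficient. The values are copied by hand from the correlation matrix above, and only the four variables discussed in this paragraph are included, so this illustrates the lookup rather than scanning the full matrix.

```r
vars <- c("fixed.acidity", "citric.acid", "density", "pH")

# Coefficients copied from the correlation matrix in this report.
r <- matrix(
  c( 1.00,  0.67,  0.67, -0.68,
     0.67,  1.00,  0.36, -0.54,
     0.67,  0.36,  1.00, -0.34,
    -0.68, -0.54, -0.34,  1.00),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# Keep only the upper triangle so each pair appears once and the diagonal is ignored.
off <- r
off[!upper.tri(off)] <- NA

# Locate the pair with the largest absolute correlation.
idx <- which(abs(off) == max(abs(off), na.rm = TRUE), arr.ind = TRUE)
strongest <- c(rownames(off)[idx[1, "row"]], colnames(off)[idx[1, "col"]])
print(strongest)
print(off[idx[1, "row"], idx[1, "col"]])
```

Within this sub-matrix the pair returned is fixed.acidity and pH, with a coefficient of -0.68.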
This is interesting. We can see that high alcohol content combined with high citric.acid is associated with high quality wine, while low alcohol content and low citric.acid go with a low quality score. Interestingly, in this plot citric.acid appears to impact the quality score more than the alcohol content does, even though its overall correlation with quality (0.23) is weaker than that of alcohol (0.48). We can observe this for some high quality wines which have lower alcohol content but high citric.acid.
Plotting alcohol against volatile.acidity (the two variables most strongly correlated with quality), we see the expected tendency: higher alcohol and lower volatile.acidity are associated with higher quality. However, we see some outliers where high alcohol content combined with higher volatile.acidity results in a lower quality score than lower alcohol content with lower volatile.acidity.
Let's now plot citric.acid vs. volatile.acidity, split by quality, and see what happens.
What we see here is that low citric.acid and high volatile.acidity result in a low quality score, which is expected based on our observations above. However, wine with medium volatile.acidity and high citric.acid will still get a low quality score. This suggests that, although both are important, it is actually more impactful to have lower volatile.acidity than to have high citric.acid.
Per the above chart, we see that the positive correlation between fixed.acidity and citric.acid is consistent across all three categories of wine (bad, good, excellent). However, this relationship is most obvious in the highest scoring wine category, where the fixed.acidity level increases along with the citric.acid level.
The relationship between fixed.acidity and density by wine category reveals that higher quality wines appear less dense than other categories at the same level of fixed.acidity.
I further explored some of the relationships from the previous sections by adding the quality variables and seeing whether each relationship holds true across all categories of wine.
Some interesting observations are:
1. Low citric.acid and high volatile.acidity result in the lowest quality score, which is expected based on previous observations. However, wine with high volatile.acidity and high citric.acid will still get a low quality score (4). This suggests that, although both are important, it is actually more impactful to have lower volatile.acidity than to have high citric.acid.
2. In the citric.acid vs. alcohol chart we notice that citric.acid appears to impact the quality score more than the alcohol content. We can see this for quality score 5, which has higher alcohol content but less citric.acid, vs. quality score 7, which has lower alcohol content but higher citric.acid.
3. In the alcohol vs. volatile.acidity chart, some wines with quality score 5 have high alcohol content together with higher volatile.acidity, yet score lower than wines with lower alcohol content and lower volatile.acidity. This suggests that volatile.acidity impacts the quality score more than the alcohol content.
One of the highest correlations we have noted in the dataset is alcohol vs quality.
It appears that higher quality wines have higher percent of alcohol content.
The bivariate analysis performed above also showed that there is a positive correlation between quality and citric acid and a negative correlation between quality and volatile acidity. Here I am plotting citric acid against volatile acidity by quality score. We can observe that low citric.acid and high volatile.acidity result in the lowest quality score (3). However, wine with high volatile.acidity and high citric.acid will still get a low quality score (4). This suggests that, although both are important, it is actually more impactful to have lower volatile.acidity than to have high citric.acid.
Our bivariate analysis shows that there is a negative correlation between density and alcohol, which makes sense given that the density of wine is close to that of water, depending on the percent alcohol and sugar content. We also observe that there is a positive correlation between fixed.acidity and citric.acid levels.
Therefore, I also want to see how the relationship between fixed.acidity and density varies by wine category. It reveals that higher quality wines appear less dense than other categories at the same level of fixed.acidity.
——
There are 1,599 observations in the dataset with 12 variables. At least 3 wine experts rated the quality of each wine, providing a rating between 0 (very bad) and 10 (very excellent).
In the univariate analysis I noted that the majority of wines in our dataset have a quality score of 5 or 6.
Comparing the properties of bad, good and excellent wines, volatile.acidity and citric.acid differ noticeably between ratings, which hints that these are properties impacting the quality of wine. Another characteristic that would be interesting to explore is the level of alcohol and its impact on the quality of wine.
Further, I explored bivariate relationships between the different chemical components, focusing on those variables which had the highest correlation coefficients. There were some more obvious observations, like the negative correlation between density and alcohol and the negative relationship between volatile.acidity and citric.acid.
There is a positive correlation between fixed.acidity and citric.acid levels, and citric.acid is known to bring freshness and flavor to wine. Meanwhile, pH and fixed.acidity have a high negative correlation: wines with lower levels of fixed.acidity appear less acidic. Also, the higher the level of fixed.acidity, the higher the density.
In the multivariate analysis we observed that the relationship between fixed.acidity and citric.acid is consistent across all three categories of wine (bad, good, excellent).
Also, higher alcohol and lower volatile.acidity are associated with higher quality. High alcohol content and high citric.acid go with the highest quality wines, while low alcohol content and low citric.acid go with the lowest quality scores. The relationship between fixed.acidity and density by wine category reveals that higher quality wines appear less dense than other categories at the same level of fixed.acidity.
You can find these and other observations described in the above sections.
In the future, it would be interesting to add to the dataset the region each wine comes from.